Closed-form two-locus sampling distributions: accuracy and universality.
Abstract
Sampling distributions play an important role in population genetics analyses, but closed-form sampling formulas are generally intractable to obtain. In the presence of recombination, there is no known closed-form sampling formula that holds for an arbitrary recombination rate. However, we recently showed that it is possible to obtain useful closed-form sampling formulas when the population-scaled recombination rate rho is large. Specifically, in the case of the two-locus infinite-alleles model, we considered an asymptotic expansion of the sampling formula in inverse powers of rho and obtained closed-form expressions for the first few terms in the expansion. In this article, we generalize this result to an arbitrary finite-alleles mutation model and show that, up to the first few terms in the expansion that we are able to compute analytically, the functional form of the asymptotic sampling formula is common to all mutation models. We carry out an extensive study of the accuracy of the asymptotic formula for the two-locus parent-independent mutation model and discuss in detail a concrete application in the context of the composite-likelihood method. Furthermore, using our asymptotic sampling formula, we establish a simple sufficient condition for a given two-locus sample configuration to have a finite maximum-likelihood estimate (MLE) of rho. This condition is the first analytic result on the classification of the MLE of rho and is instantaneous to check in practice, provided that one-locus probabilities are known.
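As an illustrative sketch only (the notation below is assumed here, not quoted from the article), the expansion in inverse powers of rho described in the abstract can be written with \mathbf{n} denoting a two-locus sample configuration and q_0, q_1, q_2 denoting the first few coefficient functions:

```latex
q(\mathbf{n};\rho) \;=\; q_0(\mathbf{n}) \;+\; \frac{q_1(\mathbf{n})}{\rho} \;+\; \frac{q_2(\mathbf{n})}{\rho^{2}} \;+\; O\!\left(\frac{1}{\rho^{3}}\right), \qquad \rho \to \infty
```

Because the two loci become independent as rho grows, the leading term q_0 is expected to reduce to a product of one-locus sampling probabilities. This is consistent with the abstract's remark that the finiteness condition for the MLE can be checked once one-locus probabilities are known.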
Related articles
Closed-form Asymptotic Sampling Distributions under the Coalescent with Recombination for an Arbitrary Number of Loci.
Obtaining a closed-form sampling distribution for the coalescent with recombination is a challenging problem. In the case of two loci, a new framework based on asymptotic series has recently been developed to derive closed-form results when the recombination rate is moderate to large. In this paper, an arbitrary number of loci is considered and combinatorial approaches are employed to find clos...
An Asymptotic Sampling Formula for the Coalescent with Recombination
Ewens sampling formula (ESF) is a one-parameter family of probability distributions with a number of intriguing combinatorial connections. This elegant closed-form formula first arose in biology as the stationary probability distribution of a sample configuration at one locus under the infinite-alleles model of mutation. Since its discovery in the early 1970s, the ESF has been used in various bi...
An Asymptotic Sampling Formula for the Coalescent with Recombination.
Ewens sampling formula (ESF) is a one-parameter family of probability distributions with a number of intriguing combinatorial connections. This elegant closed-form formula first arose in biology as the stationary probability distribution of a sample configuration at one locus under the infinite-alleles model of mutation. Since its discovery in the early 1970s, the ESF has been used in various b...
Padé Approximants and Exact Two-locus Sampling Distributions
For population genetics models with recombination, obtaining an exact, analytic sampling distribution has remained a challenging open problem for several decades. Recently, a new perspective based on asymptotic series has been introduced to make progress on this problem. Specifically, closed-form expressions have been derived for the first few terms in an asymptotic expansion of the two-locus s...
Approximate Sampling Formulae for General Finite-alleles Models of Mutation
Many applications in genetic analyses utilize sampling distributions, which describe the probability of observing a sample of DNA sequences randomly drawn from a population. In the one-locus case with special models of mutation, such as the infinite-alleles model or the finite-alleles parent-independent mutation model, closed-form sampling distributions under the coalescent have been known for ...
Journal: Genetics
Volume 183, Issue 3
Publication date: 2009